Maxmin Data Range Heuristic-Based Initial Centroid Method of Partitional Clustering for Big Data Mining
Abstract
A centroid-based clustering algorithm depends on the number of clusters, the initial centroids, the distance measure, and the statistical measure of central tendency. Centroid initialization determines convergence speed, computing efficiency, execution time, scalability, memory utilization, and performance for big data clustering. Researchers have proposed various cluster initialization techniques; some reduce the number of iterations at the cost of lower cluster quality, while others increase quality at the cost of more iterations. For these reasons, this study proposes an initial centroid method based on the Maxmin Data Range Heuristic (MDRH) for K-Means (KM) that reduces execution time and iterations and improves cluster quality. MDRH is compared against the classical KM and KM++ algorithms on four real datasets. It achieves better effectiveness and efficiency on the RS, DB, CH, SC, IS, and CT quantitative measurements.
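The abstract does not spell out how MDRH picks its centroids, so the following is only a rough sketch of the general idea of seeding K-Means from the min-max data range. It assumes one possible reading: split each feature's [min, max] range into k equal-width segments and use the segment midpoints as starting centroids. The function names (`range_based_centroids`, `kmeans`, `sse`) and the toy data are hypothetical and are not taken from the paper.

```python
import math


def range_based_centroids(points, k):
    """Hypothetical range heuristic (an assumption, not the paper's MDRH):
    for each feature, split [min, max] into k equal-width segments and
    place centroid j at the midpoint of segment j."""
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    return [
        tuple(lows[d] + (j + 0.5) * (highs[d] - lows[d]) / k for d in range(dims))
        for j in range(k)
    ]


def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd iterations. Returns (centroids, labels, passes used)."""
    centroids = [tuple(c) for c in centroids]
    labels = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # no assignment changed: converged
            return centroids, labels, iteration
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels, max_iter


def sse(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (9, 9), (9, 10), (10, 9), (10, 10)]
initial = range_based_centroids(points, k=2)
final, labels, passes = kmeans(points, initial)
print(initial, final, labels, passes, sse(points, final, labels))
```

Because the seeds are deterministic, repeated runs give the same result, unlike random initialization. Note that this simplified version places every seed on the diagonal of the data's bounding box, which may suit real datasets poorly; the published method should be consulted for the actual procedure.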
Similar resources
Uncertain Centroid based Partitional Clustering of Uncertain Data
Clustering uncertain data has emerged as a challenging task in uncertain data management and mining. Thanks to a computational complexity advantage over other clustering paradigms, partitional clustering has been particularly studied and a number of algorithms have been developed. While existing proposals differ mainly in the notions of cluster centroid and clustering objective function, little...
The clustering and classification data mining techniques in insurance fraud detection: the case of Iranian car insurance
Given the ever-growing spread of fraud in the insurance sector, especially in automobile insurance, and its negative consequences for insurance companies, applying suitable and efficient methods to identify and detect fraud in this area is essential. Understanding the patterns in data on previously reported claims can help determine whether a damage claim is genuine or not. One of the most common and widely used ways of discovering patterns in data is the use of ...
Entropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
A swift heuristic algorithm base on data mining approach for the Periodic Vehicle Routing Problem: data mining approach
The periodic vehicle routing problem focuses on establishing a plan of visits to clients over a given time horizon so as to satisfy some service level while optimizing the routes used in each time period. This paper presents a new effective heuristic algorithm based on data mining tools for periodic vehicle routing problem (PVRP). The related results of proposed algorithm are compared with the resu...
Diagnosis of diabetes by using a data mining method based on native data
Background & Aim: Detecting the abnormal performance of diabetes and subsequently getting proper treatment can reduce the mortality associated with the disease. Also, timely diagnosis will result in irreversible complications for the patient. The aim of this study was to determine the status of diabetes mellitus using data mining techniques. Methods: This is an analytical study and its databas...
Journal
Journal title: International Journal of Information Retrieval Research
Year: 2021
ISSN: 2155-6377, 2155-6385
DOI: https://doi.org/10.4018/ijirr.289954